Last Bank: Dealing with Address Reuse in Non-Uniform Cache Architecture for CMPs

Authors

  • Javier Lira
  • Carlos Molina
  • Antonio González
Abstract

In response to the constant increase in wire delays, Non-Uniform Cache Architecture (NUCA) has been introduced as an effective memory model for dealing with growing memory latencies. This architecture divides a large memory cache into smaller banks that can be accessed independently. Banks close to the cache controller therefore have a faster response time than banks located farther away from it. In this paper, we propose and analyse the insertion of an additional bank into the NUCA cache. This is called Last Bank. This extra bank deals with data blocks that have been evicted from the other banks in the NUCA cache. Furthermore, we analyse the behaviour of the cache line replacements done in the NUCA cache and propose two optimisations of Last Bank that provide significant performance benefits without incurring unaffordable implementation costs.
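The idea in the abstract can be illustrated with a minimal sketch: a set of banks whose latency grows with distance from the controller, plus one extra bank that receives blocks evicted from the others. The abstract does not give the design details, so everything concrete here is an assumption made for illustration: the class names `LRUBank` and `NucaWithLastBank`, the static `addr % num_banks` mapping, LRU replacement, the cycle counts, and the choice to move a block back to its home bank on a Last Bank hit.

```python
from collections import OrderedDict


class LRUBank:
    """A tiny fully associative bank with LRU replacement (illustrative only)."""

    def __init__(self, capacity, latency):
        self.capacity = capacity
        self.latency = latency
        self.blocks = OrderedDict()  # address -> None, least recently used first

    def lookup(self, addr):
        """Return True on a hit and mark the block as most recently used."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)
            return True
        return False

    def remove(self, addr):
        self.blocks.pop(addr, None)

    def insert(self, addr):
        """Insert addr and return the evicted address, or None if nothing was evicted."""
        victim = None
        if addr not in self.blocks and len(self.blocks) >= self.capacity:
            victim, _ = self.blocks.popitem(last=False)
        self.blocks[addr] = None
        self.blocks.move_to_end(addr)
        return victim


class NucaWithLastBank:
    """Hypothetical NUCA model: regular banks plus one bank for evicted blocks."""

    def __init__(self, num_banks=4, bank_capacity=2, last_bank_capacity=4):
        # Assumed latency model: bank i costs (i + 1) cycles, so closer banks are faster.
        self.banks = [LRUBank(bank_capacity, i + 1) for i in range(num_banks)]
        # Assumption: the extra bank is the farthest one from the controller.
        self.last_bank = LRUBank(last_bank_capacity, num_banks + 1)
        self.memory_latency = 100  # assumed off-chip cost in cycles

    def access(self, addr):
        """Return (outcome, latency) for one access to a block address."""
        bank = self.banks[addr % len(self.banks)]  # static mapping, an assumption
        if bank.lookup(addr):
            return ("bank_hit", bank.latency)
        if self.last_bank.lookup(addr):
            # Reused block found among the evicted ones: bring it back home.
            self.last_bank.remove(addr)
            self._fill(bank, addr)
            return ("last_bank_hit", self.last_bank.latency)
        self._fill(bank, addr)
        return ("miss", self.memory_latency)

    def _fill(self, bank, addr):
        victim = bank.insert(addr)
        if victim is not None:
            # Evicted blocks go to the extra bank; its own victim is simply dropped.
            self.last_bank.insert(victim)
```

In this sketch, three blocks that map to the same two-entry bank force an eviction, and a later access to the evicted address is served from the extra bank rather than from off-chip memory, which is the address-reuse case the title refers to.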

Similar resources

A Compile-Time Data Locality Optimization Framework for NUCA Chip Multiprocessors

With increasing numbers of cores, future CMPs (Chip MultiProcessors) are likely to have a tiled architecture with a portion of shared L2 cache on each tile and a bank-interleaved distribution of the address space. For data-parallel programming models, there is a mismatch between such a non-uniform cache organization and the canonical row-major or column-major layouts of multi-dimensional arrays...

3D Tree Cache – A Novel Approach to Non-Uniform Access Latency Cache Architectures for 3D CMPs

We consider a non-uniform access latency cache architecture (NUCA) design for 3D chip multiprocessors (CMPs) where cache structures are divided into small banks interconnected by a network-on-chip (NoC). In earlier NUCA designs, data is placed in banks either statically (S-NUCA) or dynamically (D-NUCA). In both S-NUCA and D-NUCA designs, scaling to hundreds of cores can pose several challenges. ...

BP-NUCA: Cache Pressure-Aware Migration for High-Performance Caching in CMPs

As the momentum behind Chip Multi-Processors (CMPs) continues to grow, Last Level Cache (LLC) management becomes a crucial issue for CMPs because off-chip accesses often incur high latency. Private cache design is distinguished by lower local access latency, good performance isolation, and easy scalability, and is thus becoming an attractive design alternative for the LLC of CMPs. This paper propose...

Adaptive Block Pinning Based: Dynamic Cache Partitioning for Multi-Core Architectures

This paper is aimed at exploring the various techniques currently used for partitioning last level (L2/L3) caches in multicore architectures, identifying their strengths and weaknesses and thereby proposing a novel partitioning scheme known as Adaptive Block Pinning which would result in a better utilization of the cache resources in CMPs. The widening speed gap between processors and memory al...

Critique for paper “Optimizing Replication, Communication, and Capacity Allocation in CMPs”

The paper presented an approach for extracting the benefits of both unified and distributed cache approaches using Non-Uniform Cache Architecture (NUCA).

  • The NUCA approach enables us to retain as many memory references on chip as possible, thus greatly reducing the penalty due to off-chip accesses
  • The approach utilizes distance locality, thereby data is stored as close to the processor core as possi...

Publication date: 2009